Complete Parsimony Haplotype Inference Problem and Algorithms
Abstract
Haplotype inference by pure parsimony (HIPP) is a well-known paradigm for haplotype inference. To assess the biological significance of this paradigm, we generalize HIPP to the problem of finding all optimal solutions, which we call complete HIPP. We study intrinsic haplotype features, such as backbone haplotypes and fat genotypes, as well as equal columns and decomposability. We explicitly exploit these features in three computational approaches: integer linear programming, depth-first branch-and-bound, and a hybrid algorithm that draws on the diverse strengths of the first two. Our experimental analysis shows that our optimized algorithms are significantly superior to the baseline algorithms, often running orders of magnitude faster. Finally, our experiments provide some useful insights into the intrinsic features of this interesting problem.
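To make the notion of "all optimal solutions" concrete, the following is a minimal brute-force sketch of complete HIPP on toy inputs. It is not one of the three approaches described in the abstract. It assumes the common genotype encoding in which each site is 0 or 1 when homozygous and 2 when heterozygous; the function names `explains`, `compatible`, and `complete_hipp` are hypothetical and chosen only for illustration.

```python
from itertools import combinations, product


def explains(h1, h2, g):
    """Return True if the haplotype pair (h1, h2) resolves genotype g.

    Assumed encoding: a genotype site is 0 or 1 (homozygous) or 2 (heterozygous).
    """
    for a, b, s in zip(h1, h2, g):
        if s == 2:
            if a == b:  # a heterozygous site needs two different alleles
                return False
        elif a != s or b != s:  # a homozygous site needs both alleles equal to s
            return False
    return True


def compatible(g):
    """Enumerate every haplotype that could take part in a pair resolving g."""
    return product(*[(0, 1) if s == 2 else (s,) for s in g])


def complete_hipp(genotypes):
    """Return all minimum-size haplotype sets that explain every genotype.

    Exhaustive search over candidate subsets; exponential, toy inputs only.
    """
    candidates = sorted({h for g in genotypes for h in compatible(g)})
    for k in range(1, len(candidates) + 1):
        optimal = [
            frozenset(hs)
            for hs in combinations(candidates, k)
            if all(any(explains(a, b, g) for a in hs for b in hs) for g in genotypes)
        ]
        if optimal:  # the first size with any solution is the parsimony optimum
            return optimal
    return []


if __name__ == "__main__":
    for solution in complete_hipp([(2, 2), (0, 2)]):
        print(sorted(solution))
```

For the two genotypes in the usage example, the search finds two distinct optimal sets of three haplotypes each, which illustrates why enumerating all optima can yield more information than reporting a single one.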
Related resources
A new mathematical model for the problem of inferring haplotypes from genotypes under the parsimony criterion
Haplotype inference is one of the most important problems in bioinformatics because of its many applications in the diagnosis and treatment of inherited diseases such as diabetes, Alzheimer's, and heart disease. This has led researchers to compete in proposing mathematical models and designing algorithms to solve the problem. Despite the existence of ...
Stochastic local search for large-scale instances of the haplotype inference problem by pure parsimony
Haplotype Inference is a challenging problem in bioinformatics that consists in inferring the basic genetic constitution of diploid organisms on the basis of their genotype. This information allows researchers to perform association studies for the genetic variants involved in diseases and the individual responses to therapeutic agents. A notable approach to the problem is to encode it as a com...
Haplotype Inference with Boolean Satisfiability
Mutation in DNA is the principal cause for differences among human beings, and Single Nucleotide Polymorphisms (SNPs) are the most common mutations. Hence, a fundamental task is to complete a map of haplotypes (which identify SNPs) in the human population. Associated with this effort, a key computational problem is the inference of haplotype data from genotype data, since in practice genotype d...
Toward an algebraic understanding of haplotype inference by pure parsimony
Haplotype inference by pure parsimony (HIPP) is known to be NP-Hard. Despite this, many algorithms successfully solve HIPP instances on simulated and real data. In this paper, we explore the connection between algebraic rank and the HIPP problem, to help identify easy and hard instances of the problem. The rank of the input matrix is known to be a lower bound on the size an optimal HIPP solutio...
Efficient Haplotype Inference with Pseudo-boolean Optimization
Haplotype inference from genotype data is a key computational problem in bioinformatics, since retrieving directly haplotype information from DNA samples is not feasible using existing technology. One of the methods for solving this problem uses the pure parsimony criterion, an approach known as Haplotype Inference by Pure Parsimony (HIPP). Initial work in this area was based on a number of dif...